Smart Paper Notes

A Machine Learning Approach for Early Detection of Fish Diseases by Analyzing Water Quality

Al-Akhir Nayan , Ahamad Nokib Mozumder , Joyeta Saha , Khan Raqib Mahmud , Abul Kalam Al Azad , Muhammad Golam Kibria

Categories: Machine Learning | Artificial Intelligence

2021-02-15

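Early detection of fish diseases and identification of their underlying causes is crucial for farmers, so that they can take the steps needed to mitigate a potential outbreak and avoid financial losses with a clear negative impact on the national economy. Fish diseases are typically caused by viruses and bacteria; according to biochemical studies, the presence of certain bacteria and viruses can affect the levels of pH, DO, BOD, COD, TSS, TDS, EC, PO₄³⁻, NO₃-N, and NH₃-N in water, leading to fish deaths. In addition, natural processes such as photosynthesis, respiration, and decomposition also alter water quality in ways that adversely affect fish health. Motivated by the recent success of machine learning techniques, this paper adopts a state-of-the-art machine learning algorithm to detect and predict the degradation of water quality in a timely and accurate manner, which in turn helps in taking pre-emptive measures against potential fish diseases. Experimental results on real datasets show that the algorithm detects water-quality-specific fish diseases with high accuracy.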

Translated by Google Translate

Knowledge, beliefs, attitudes and perceived risk about COVID-19 vaccine and determinants of COVID-19 vaccine acceptance in Bangladesh

Sultan Mahmud , Md. Mohsin , Ijaz Ahmed Khan , Ashraf Uddin Mian , Miah Akib Zaman

Categories: Artificial Intelligence

2021-03-28

A total of 605 eligible respondents took part in this survey (population size 1630046161 and required sample size 591), with ages ranging from 18 to 100. A large proportion of the respondents were younger than 50 (82%) and male (62.15%). The majority of the respondents lived in urban areas (60.83%). A total of 61.16% (370/605) of the respondents were willing to accept the COVID-19 vaccine. Among those willing to accept it, only 35.14% were willing to take the vaccine immediately, while 64.86% would delay vaccination until they were confident of the vaccine's efficacy and safety or until COVID-19 became deadlier in Bangladesh. The regression results showed that age, gender, location (urban/rural), level of education, income, perceived risk of being infected with COVID-19 in the future, perceived severity of infection, previous vaccination experience after age 18, and greater knowledge about COVID-19 and vaccination were significantly associated with acceptance of COVID-19 vaccines. The research reported a high prevalence of COVID-19 vaccine refusal and hesitancy in Bangladesh.


An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation

Kevin Moran , Ali Yachnes , George Purnell , Junayed Mahmud , Michele Tufano , Carlos Bernal-Cárdenas , Denys Poshyvanyk , Zach H'Doubler

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2023-01-03

Existing automated techniques for software documentation typically attempt to reason between two main sources of information: code and natural language. However, this reasoning process is often complicated by the lexical gap between more abstract natural language and more structured programming languages. One potential bridge for this gap is the Graphical User Interface (GUI), as GUIs inherently encode salient information about underlying program functionality into rich, pixel-based data representations. This paper offers one of the first comprehensive empirical investigations into the connection between GUIs and functional, natural language descriptions of software. First, we collect, analyze, and open source a large dataset of functional GUI descriptions consisting of 45,998 descriptions for 10,204 screenshots from popular Android applications. The descriptions were obtained from human labelers and underwent several quality control mechanisms. To gain insight into the representational potential of GUIs, we investigate the ability of four Neural Image Captioning models to predict natural language descriptions of varying granularity when provided a screenshot as input. We evaluate these models quantitatively, using common machine translation metrics, and qualitatively through a large-scale user study. Finally, we offer learned lessons and a discussion of the potential shown by multimodal models to enhance future techniques for automated software documentation.


Floods Relevancy and Identification of Location from Twitter Posts using NLP Techniques

Muhammad Suleman , Muhammad Asif , Tayyab Zamir , Ayaz Mehmood , Jebran Khan , Nasir Ahmad , Kashif Ahmad

Categories: Natural Language Processing

2023-01-01

This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related posts from non-relevant social posts, while LETT is a Named Entity Recognition (NER) task that aims at extracting location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, DistilBERT, and ALBERT, obtaining F1-scores of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models, namely BERT, RoBERTa, and DistilBERT, obtaining F1-scores of 0.6256, 0.6744, and 0.6723, respectively.


Skeletal Video Anomaly Detection using Deep Learning: Survey, Challenges and Future Directions

Pratik K. Mishra , Alex Mihailidis , Shehroz S. Khan

Categories: Computer Vision

2022-12-31

The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them.


Guidance Through Surrogate: Towards a Generic Diagnostic Attack

Muzammal Naseer , Salman Khan , Fatih Porikli , Fahad Shahbaz Khan

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-12-30

Adversarial training is an effective approach to make deep neural networks robust against adversarial attacks. Recently, different adversarial training defenses have been proposed that not only maintain a high clean accuracy but also show significant robustness against popular and well-studied adversarial attacks such as PGD. High adversarial robustness can also arise if an attack fails to find adversarial gradient directions, a phenomenon known as 'gradient masking'. In this work, we analyse the effect of label smoothing on adversarial training as one of the potential causes of gradient masking. We then develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA). Our attack approach is based on a 'match and deceive' loss that finds optimal adversarial directions through guidance from a surrogate model. Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size. Furthermore, our proposed G-PGA is generic, so it can be combined with an ensemble attack strategy, as we demonstrate for the case of Auto-Attack, leading to efficiency and convergence speed improvements. More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
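For orientation, the baseline PGD attack that the abstract refers to can be sketched as follows. This is plain L-infinity PGD against a toy logistic-regression model, not G-PGA: there is no surrogate model and no 'match and deceive' loss here, and the model, the function names, and the values of eps, step, and iters are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(x, y, w, b):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def pgd_attack(x, y, w, b, eps=0.3, step=0.1, iters=10):
    """L-infinity PGD: take signed gradient ascent steps on the loss,
    projecting back into the eps-ball around the clean input each time."""
    x_adv = list(x)
    for _ in range(iters):
        grad = loss_gradient(x_adv, y, w, b)
        # Signed ascent step; (g > 0) - (g < 0) is the sign of g.
        x_adv = [xa + step * ((g > 0) - (g < 0)) for xa, g in zip(x_adv, grad)]
        # Projection: clip each coordinate to [x_i - eps, x_i + eps].
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

# Hypothetical toy model and input, for illustration only.
w, b = [2.0, -1.0], 0.0
x, y = [0.2, 0.1], 1
x_adv = pgd_attack(x, y, w, b)
```

With a fixed step size and no random restarts, an attack like this can stall when the loss surface gives uninformative gradients, which is the gradient-masking failure mode the abstract describes.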


Blind Restoration of Real-World Audio by 1D Operational GANs

Turker Ince , Serkan Kiranyaz , Ozer Can Devecioglu , Muhammad Salman Khan , Muhammad Chowdhury , Moncef Gabbouj

Categories: Machine Learning

2022-12-30

Objective: Despite numerous studies proposed for audio restoration in the literature, most of them focus on an isolated restoration problem such as denoising or dereverberation, ignoring other artifacts. Moreover, assuming a noisy or reverberant environment with a limited number of fixed signal-to-distortion ratio (SDR) levels is a common practice. However, real-world audio is often corrupted by a blend of artifacts such as reverberation, sensor noise, and background audio mixture with varying types, severities, and durations. In this study, we propose a novel approach for blind restoration of real-world audio signals by Operational Generative Adversarial Networks (Op-GANs) with temporal and spectral objective metrics to enhance the quality of the restored audio signal regardless of the type and severity of each artifact corrupting it. Methods: 1D Operational-GANs are used with a generative neuron model optimized for blind restoration of any corrupted audio signal. Results: The proposed approach has been evaluated extensively over the benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets corrupted with a random blend of artifacts, each with a random severity, to mimic real-world audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved, respectively, which are substantial when compared with the baseline methods. Significance: This is a pioneering study in blind audio restoration with the unique capability of direct (time-domain) restoration of real-world audio whilst achieving an unprecedented level of performance for a wide SDR range and artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally effective real-world audio restoration with significantly improved performance. The source codes and the generated real-world audio datasets are shared publicly with the research community in a dedicated GitHub repository.
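To make the reported "SDR improvement" figures concrete, here is a minimal sketch of one common way to compute SDR in decibels and the improvement between a corrupted and a restored signal. The abstract does not give the exact SDR definition used in the paper, so the simple energy-ratio form below, the function names, and the sample values are assumptions.

```python
import math

def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB: reference energy over error energy.
    Assumes the estimate is not identical to the reference (non-zero error)."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_energy / error_energy)

def sdr_improvement_db(reference, corrupted, restored):
    """How many dB the restoration gains over the corrupted input."""
    return sdr_db(reference, restored) - sdr_db(reference, corrupted)

# Hypothetical four-sample signals, for illustration only.
clean = [1.0, 0.0, -1.0, 0.0]
corrupted = [1.5, 0.5, -0.5, 0.5]
restored = [1.1, 0.0, -1.1, 0.0]
gain = sdr_improvement_db(clean, corrupted, restored)
```

A positive gain means the restored signal is closer to the clean reference than the corrupted input was.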


Context-Aware Target Classification with Hybrid Gaussian Process prediction for Cooperative Vehicle Safety systems

Rodolfo Valiente , Arash Raftari , Hossein Nourkhiz Mahjoub , Mahdi Razzaghpour , Syed K. Mahmud , Yaser P. Fallah

Categories: Robotics | Artificial Intelligence

2022-12-24

Vehicle-to-Everything (V2X) communication has been proposed as a potential solution to improve the robustness and safety of autonomous vehicles by improving coordination and removing the barrier of non-line-of-sight sensing. Cooperative Vehicle Safety (CVS) applications are tightly dependent on the reliability of the underlying data system, which can suffer from loss of information due to the inherent issues of its different components, such as sensor failures or the poor performance of V2X technologies under dense communication channel load. In particular, information loss affects the target classification module and, subsequently, the safety application performance. To enable reliable and robust CVS systems that mitigate the effect of information loss, we propose a Context-Aware Target Classification (CA-TC) module coupled with a hybrid learning-based predictive modeling technique for CVS systems. The CA-TC consists of two modules: a Context-Aware Map (CAM) and a Hybrid Gaussian Process (HGP) prediction system. The vehicle safety applications use the information from the CA-TC, which makes them more robust and reliable. The CAM leverages vehicles' path history, road geometry, tracking, and prediction, and the HGP is used to provide accurate predictions of vehicles' trajectories to compensate for data loss (due to communication congestion) or inaccuracies in sensor measurements. Based on offline real-world data, we learn a finite bank of driver models that represent the joint dynamics of the vehicle and the drivers' behavior. We combine offline training and online model updates with on-the-fly forecasting to account for new possible driver behaviors. Finally, our framework is validated using simulation and realistic driving scenarios to confirm its potential in enhancing the robustness and reliability of CVS systems.


LMFLOSS: A Hybrid Loss For Imbalanced Medical Image Classification

Abu Adnan Sadi , Labib Chowdhury , Nursrat Jahan , Mohammad Newaz Sharif Rafi , Radeya Chowdhury , Faisal Ahamed Khan , Nabeel Mohammed

Categories: Computer Vision | Artificial Intelligence

2022-12-24

Automatic medical image classification is a very important field where the use of AI has the potential to have a real social impact. However, many challenges still stand in the way of practically effective solutions. One of them is that most medical imaging datasets have a class imbalance problem, and existing AI techniques, particularly neural network-based deep-learning methodologies, often perform poorly in such scenarios. This makes the area an interesting and active research focus. In this study, we propose a novel loss function to train neural network models to mitigate this critical issue. Through rigorous experiments on three independently collected datasets from three different medical imaging domains, we empirically show that our proposed loss function consistently performs well, with an improvement of between 2% and 10% in macro F1 compared to the baseline models. We hope that our work will precipitate new research toward a more generalized approach to medical image classification.
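The macro F1 metric quoted above is a natural choice under class imbalance, because it averages per-class F1 scores with equal weight. The sketch below shows that calculation on a made-up, imbalanced label set. It illustrates the metric only, not the proposed loss function, and the function name and data are assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so rare classes count as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true positives, false positives, or false negatives gets F1 = 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical imbalanced data: 8 samples of class 0, 2 of class 1,
# and a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this example the majority-class predictor reaches 80% accuracy, yet its macro F1 is well below 0.5 because the minority class contributes an F1 of zero.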


Privacy-Protecting Behaviours of Risk Detection in People with Dementia using Videos

Pratik K. Mishra , Andrea Iaboni , Bing Ye , Kristine Newman , Alex Mihailidis , Shehroz S. Khan

Categories: Computer Vision

2022-12-20

People living with dementia often exhibit behavioural and psychological symptoms of dementia that can put their own and others' safety at risk. Existing video surveillance systems in long-term care facilities can be used to monitor such behaviours of risk and alert staff to prevent potential injuries or, in some cases, death. However, these behaviours of risk are heterogeneous and infrequent in comparison to normal events. Moreover, analyzing raw videos can also raise privacy concerns. In this paper, we present two novel privacy-protecting video-based anomaly detection approaches to detect behaviours of risk in people with dementia. We either extract body pose information as skeletons or use semantic segmentation masks to replace multiple humans in the scene with their semantic boundaries. Our work differs from most existing approaches for video anomaly detection, which focus on appearance-based features that can put the privacy of a person at risk and are also susceptible to pixel-based noise, including illumination and viewing direction. We used anonymized videos of normal activities to train customized spatio-temporal convolutional autoencoders and identify behaviours of risk as anomalies. We show our results on a real-world study conducted in a dementia care unit with patients with dementia, containing approximately 21 hours of normal activities data for training and 9 hours of data containing normal events and behaviours of risk for testing. We compared our approaches with the original RGB videos and obtained an equivalent area under the receiver operating characteristic curve of 0.807 for the skeleton-based approach and 0.823 for the segmentation mask-based approach. This is one of the first studies to incorporate privacy into the detection of behaviours of risk in people with dementia.
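The evaluation idea described above, where reconstruction error from an autoencoder trained on normal data is used as an anomaly score and summarized by the area under the ROC curve, can be sketched roughly as follows. The autoencoder itself is omitted. The mean-squared-error score, the function names, and the toy numbers are assumptions for illustration, not the paper's pipeline.

```python
def reconstruction_error(frame, reconstruction):
    """Mean squared error between an input and its reconstruction, used as the anomaly score."""
    return sum((a - b) ** 2 for a, b in zip(frame, reconstruction)) / len(frame)

def roc_auc(scores, labels):
    """AUC as the probability that a random anomalous sample (label 1) scores
    higher than a random normal sample (label 0); ties count as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical anomaly scores for four test windows (1 = behaviour of risk).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = roc_auc(scores, labels)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the 0.807 and 0.823 figures in the abstract.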
